Abstract: Anomaly detection is an important data mining task that aims to discover elements showing significant deviation from expected behaviour; such elements are termed outliers. This paper studies the problem of outlier detection on continuous data streams. The proposed system, EMCOD (Event based Micro Cluster-Based Continuous Outlier Detection), provides algorithms for continuous outlier monitoring on deterministic data streams using a sliding window. We design efficient algorithms for continuous monitoring of distance-based outliers in sliding windows over data streams, aiming to eliminate the limitations of previously proposed SVDD algorithms. The primary concerns are improving efficiency and reducing storage consumption. The proposed algorithms are based on an event-based framework that takes advantage of the expiration time of objects to avoid unnecessary computations. The EMCOD algorithm is an outlier detection method based on micro-clusters. The technique reduces the required storage overhead, runs faster than the previously proposed SVDD technique, and offers significant flexibility. Experiments were performed on real-life as well as synthetic data sets.
Keywords: Anomaly Detection, Outlier Detection, Distance Calculation, Cluster Outlier.
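To make the notion of a distance-based outlier over a sliding window concrete, the following is a minimal baseline sketch, not the EMCOD algorithm itself. It assumes the common definition in which an object is an outlier if it has fewer than a given number of neighbours within a given distance among the objects currently in the window. The names `window_outliers`, `window_size`, `radius`, and `min_neighbors` are hypothetical and chosen for illustration, as is the count-based window. The sketch recomputes every neighbour count on each arrival, which is the kind of repeated work that an event-based, micro-cluster-based approach is intended to avoid.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def window_outliers(stream, window_size, radius, min_neighbors):
    """Naive baseline (illustrative only, not EMCOD).

    After each arrival, yield (index, outliers), where outliers lists the
    points in the current window that have fewer than min_neighbors other
    points within distance radius.
    """
    # A bounded deque drops the oldest point automatically, which models
    # expiration from a count-based sliding window (an assumption here).
    window = deque(maxlen=window_size)
    for i, point in enumerate(stream):
        window.append(point)
        outliers = []
        for p in window:
            # Count points within radius, then subtract 1 for p itself.
            neighbors = sum(1 for q in window if dist(p, q) <= radius) - 1
            if neighbors < min_neighbors:
                outliers.append(p)
        yield i, outliers


if __name__ == "__main__":
    points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (0.1, 0.1)]
    for step, found in window_outliers(points, window_size=4,
                                       radius=0.5, min_neighbors=2):
        print(step, found)
```

Each arrival costs a number of distance computations quadratic in the window size, so this baseline serves only as a reference point for the definition, not as an efficient monitor.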